Editor’s note: Faina Shmulyian is VP, data sciences, and Sheilah Wagner is director, data sciences, at ENGINE Insights. This is an edited version of a post that originally appeared under the title, “Using segmentation to understand customers.”
How to drive results using segmentation
Companies need to be nimble and focus marketing efforts and dollars where and when they are needed. Segmentation can direct those efforts and help a company know its customers and potential customers. Following the proper steps, focusing on the most applicable segments and using modern segmentation techniques (e.g., biclustering, non-negative matrix factorization and cluster ensembles) can help ensure the most actionable results.
Modern segmentations are expected to be rich, detailed, informative and simultaneously actionable and efficient in driving marketing tactics. Therefore, the right choice of clustering techniques is essential in developing high-quality segmentations for our clients.
How segmentation benefits business
Segmentation analysis can address several business objectives. To promote a product or service, it is often more effective to target a subgroup of potential customers with similar interests using specific messaging than to spread one message that may appeal to only a fraction of potential consumers. Additionally, results can be used to better understand consumers, so messages are relevant and perceived as being for “people like them.”
When the goal is new product development, concepts, products and claims can be developed specifically to offer the benefits that various groups seek. Segmentation can reveal unmet needs among groups of customers and identify areas where competition is low, both of which open opportunities for success.
With targeted segmentation, the groups can be mapped to a company’s database so they can be partitioned and utilized for targeted promotional campaigns. Ultimately, segmentation allows a company to be more agile and focus efforts where needed.
Six steps in the segmentation process
- Clearly define the purpose for segmenting, as the structure of the survey will vary based on the business objectives. Utilizing qualitative research can be instrumental in uncovering consumer needs and informing attribute development and language that is familiar to consumers.
- Identify the market or population to be targeted. If the target population is homogeneous to begin with, few differences may exist to separate one group from another, so a broad target is best for allowing more groups to emerge.
- Create and evaluate the usability of the segments. Segments should be identifiable, reproducible, accessible, actionable and large enough to be profitable.
- Name the segments to help people identify with and get to know groups, making implementing marketing strategies easier.
- Hold a working session with clients to review the details and pros and cons of potential segmentation schemes and to discuss the marketing potential of each segment. This is imperative to determine the best final solution for the client’s purpose.
- Once a final segmentation solution is determined, develop reporting. Creating a typing tool is useful for identifying segments in future research.
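One simple way to picture the typing tool mentioned in the last step is as a rule that assigns a new respondent to the closest existing segment. The sketch below is only an illustration under stated assumptions: it uses a nearest-centroid rule (one of several possible classifiers, not necessarily the one used in practice), and the survey answers, segment names and function names are all hypothetical.

```python
import numpy as np

def fit_typing_tool(answers, segments):
    """Return the segment names and the mean answer profile of each segment."""
    answers = np.asarray(answers, dtype=float)
    segments = np.asarray(segments)
    names = np.unique(segments)  # sorted, one entry per segment
    centroids = np.array([answers[segments == s].mean(axis=0) for s in names])
    return names, centroids

def assign_segment(new_answers, names, centroids):
    """Assign each new respondent to the segment with the nearest profile."""
    new_answers = np.atleast_2d(np.asarray(new_answers, dtype=float))
    # Squared Euclidean distance from every respondent to every centroid.
    dists = ((new_answers[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return names[dists.argmin(axis=1)]

# Hypothetical data: answers to three typing questions on a 1-5 scale
# from respondents who were already segmented.
answers = [[5, 1, 4], [4, 2, 5], [5, 1, 5],
           [1, 5, 2], [2, 4, 1], [1, 5, 1]]
segments = ["Value seekers"] * 3 + ["Brand loyalists"] * 3

names, centroids = fit_typing_tool(answers, segments)

# Two new respondents from a later study.
predicted = assign_segment([[4, 1, 5], [2, 5, 2]], names, centroids)
print(list(predicted))
```

In this toy case the first new respondent resembles the first group and the second resembles the second group. A real typing tool would normally be validated on held-out respondents before use.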
Focus on segments that best address business issues
It is important to work with internal or external clients throughout the process to find the segmentation that works best for the client’s purpose(s). While targeted marketing is beneficial, it is also costly to create different marketing actions for different segments; therefore, focusing on fewer, more lucrative groups is often advantageous. For instance, combining two segments that are not significantly different in some areas can be effective.
Also, if a segment is complicated or costly to reach, or if the segment is too small to be worthwhile, it is better to focus on other more viable groups. The final solution and names must resonate with the client as they need to champion the targeted marketing within the company.
Three modern segmentation approaches
In a large segmentation, segments must be identified and profiled on many variables. High dimensionality introduces random variability from irrelevant variables, which can make it difficult to find an effective cluster solution with traditional segmentation techniques. It can also make the solution unstable, reducing reproducibility and confidence that the best result has been found. Below are modern techniques we recommend for segmentations that use high-dimensional and fragmented data.
Biclustering, or co-clustering, was first investigated by Hartigan in 1972 but gained popularity only after Cheng and Church applied it to gene expression data in 2000. Today, biclustering is used in many areas, such as biomedicine, text mining and marketing.
Biclustering is a statistical learning technique that attempts to find homogeneous partitions of rows and columns of a data matrix. Applied to segmentations, it simultaneously groups respondents and variables explicitly utilizing relationships between the entities.
In a typical segmentation, respondents are clustered on the similarity of their reactions to a set of variables. Biclustering identifies homogeneous “cells” of respondents and variables, rearranging the data matrix into a checkerboard-like structure. For many high-dimensional segmentations, factor analysis is a necessary preparatory step before any clustering technique is applied. In biclustering, factor analysis is embedded in the algorithm, providing dimensionality reduction and a better understanding of the underlying variables driving the segmentation. In effect, biclustering segments respondents on what is important to them.
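To make the checkerboard idea concrete, here is a minimal sketch that groups rows and columns at the same time. It is an assumption-laden illustration, not the procedure used by the authors: it uses a simplified spectral co-clustering split into two groups (the sign of the second singular vector of a normalized matrix), and the ratings matrix and function name are hypothetical. Libraries such as scikit-learn offer fuller implementations, for example `SpectralCoclustering`.

```python
import numpy as np

def spectral_cocluster_two(A):
    """Split the rows and columns of a positive matrix into two co-clusters.

    Simplified spectral co-clustering: scale by row and column sums,
    take the SVD, and split on the sign of the second singular vectors.
    Assumes every row sum and column sum is greater than zero.
    """
    A = np.asarray(A, dtype=float)
    row_scale = 1.0 / np.sqrt(A.sum(axis=1))
    col_scale = 1.0 / np.sqrt(A.sum(axis=0))
    A_norm = row_scale[:, None] * A * col_scale[None, :]
    U, _, Vt = np.linalg.svd(A_norm, full_matrices=False)
    # The first singular pair only reflects overall scale; the second
    # carries the two-block structure for rows and columns jointly.
    row_labels = (row_scale * U[:, 1] > 0).astype(int)
    col_labels = (col_scale * Vt[1, :] > 0).astype(int)
    return row_labels, col_labels

# Hypothetical ratings: 6 respondents (rows) x 6 attributes (columns),
# with the two groups interleaved so the structure is not obvious.
A = np.array([
    [5, 1, 5, 1, 4, 1],
    [1, 5, 1, 4, 1, 5],
    [4, 1, 5, 1, 5, 1],
    [1, 4, 1, 5, 1, 5],
    [5, 1, 4, 1, 5, 2],
    [1, 5, 2, 5, 1, 4],
])

row_labels, col_labels = spectral_cocluster_two(A)

# Reorder rows and columns by label to expose the checkerboard.
row_order = np.argsort(row_labels, kind="stable")
col_order = np.argsort(col_labels, kind="stable")
reordered = A[row_order][:, col_order]
print(reordered)
```

After reordering, respondents and the attributes they rate highly sit together in the diagonal blocks, which is the checkerboard-like structure described above.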
Non-negative matrix factorization (NMF) is an algorithm introduced by Lee and Seung in 1999 to learn parts-based representations of objects; in their examples, the parts were facial features in images of faces and semantic features of text.
Like other matrix factorization techniques, such as principal components analysis and independent components analysis, NMF decomposes a matrix into two smaller matrices whose product approximates the original matrix. Unlike those techniques, NMF constrains both factors to be non-negative.
In a consumer segmentation, NMF will bundle attributes/variables into a set of canonical patterns and then each consumer will be described by a sparse map representing a combination of a few patterns. These patterns are used to link consumers to segments.
NMF is especially suitable for segmentations on large sparse data sets and for segmentations involving data fusion. It is robust with data of different types, including numerical and categorical data.
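The decomposition can be sketched in a few lines. The example below is a minimal illustration, assuming the classic multiplicative update rule for squared error and a small, hypothetical respondent-by-attribute matrix; the function name, the number of patterns and the rule of assigning each respondent to the strongest pattern are assumptions made for the example, not the authors' method.

```python
import numpy as np

def nmf(V, n_patterns, n_iter=500, seed=0, eps=1e-9):
    """Approximate non-negative V (respondents x attributes) as W @ H.

    Uses multiplicative updates, which keep W and H non-negative
    as long as they start out positive.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, n_patterns)) + eps
    H = rng.random((n_patterns, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    # Rescale so each pattern (row of H) sums to 1; W absorbs the scale,
    # so the product W @ H is unchanged and columns of W are comparable.
    scale = H.sum(axis=1)
    return W * scale[None, :], H / scale[:, None]

# Hypothetical data: 6 consumers x 4 attributes (for example, usage counts).
V = np.array([
    [5, 4, 0, 0],
    [4, 5, 0, 0],
    [5, 5, 1, 0],
    [0, 0, 4, 5],
    [0, 1, 5, 4],
    [0, 0, 5, 5],
], dtype=float)

W, H = nmf(V, n_patterns=2)

# H holds the attribute patterns; W says how much of each pattern
# a consumer uses. Here each consumer is linked to the strongest pattern.
segments = W.argmax(axis=1)
print(np.round(H, 2))
print(segments)
```

Assigning each consumer to a single strongest pattern is the simplest possible link between patterns and segments; the weights in `W` could also be passed to a separate clustering step.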
Cluster ensemble, or consensus clustering, analysis is a computationally intensive data mining technique representing a recent advance in unsupervised learning (Strehl and Ghosh, 2002). It has been suggested as a generic approach for improving the accuracy and stability of base clustering algorithm results.
Every cluster analysis produces clusters, whether there is any underlying structure in the data or not. The fact that a solution seems reasonable does not guarantee that the results would be reproducible with a different sample of customers. Ensemble algorithms stabilize segmentation solutions and improve reproducibility.
Cluster ensemble analysis begins by generating multiple cluster solutions using a collection of base learner algorithms. These algorithms can be as basic as k-means or as advanced as latent class analysis, NMF or biclustering. It then derives a consensus solution that is more robust and of higher quality than any of the individual ensemble members used to create it. Cluster ensemble solutions exhibit lower sensitivity to noise, outliers and sampling variations.
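The two stages described above (many base solutions, then a consensus) can be sketched as follows. This is a minimal illustration under stated assumptions, not a production method: the base learner is a plain k-means run on random subsamples, the consensus step links respondents who are clustered together in more than half of the runs in which both appear (a simple co-association approach, one of several ways to form a consensus), and the data, function names and parameter values are hypothetical.

```python
import numpy as np

def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd's k-means from a random start; returns a label per row."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def co_association(X, k, n_runs=30, sample_frac=0.8, seed=0):
    """Share of runs in which each pair of rows lands in the same cluster."""
    rng = np.random.default_rng(seed)
    n = len(X)
    together = np.zeros((n, n))
    sampled = np.zeros((n, n))
    for _ in range(n_runs):
        idx = rng.choice(n, size=int(sample_frac * n), replace=False)
        labels = kmeans(X[idx], k, rng)
        same = labels[:, None] == labels[None, :]
        together[np.ix_(idx, idx)] += same
        sampled[np.ix_(idx, idx)] += 1
    return together / np.maximum(sampled, 1)

def consensus_labels(C, threshold=0.5):
    """Group rows connected by co-association above the threshold."""
    n = len(C)
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] >= 0:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero((C[i] > threshold) & (labels < 0)):
                labels[j] = current
                stack.append(j)
        current += 1
    return labels

# Hypothetical data: two well-separated groups of 20 respondents each.
data_rng = np.random.default_rng(42)
X = np.vstack([
    data_rng.normal([0, 0], 1.0, size=(20, 2)),
    data_rng.normal([10, 10], 1.0, size=(20, 2)),
])

C = co_association(X, k=2)
consensus = consensus_labels(C)
print(consensus)
```

Pairs with co-association values near 0.5 are the ones the base solutions disagree on, so the matrix `C` can also serve as a rough diagnostic of how stable a segmentation is.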